Stochastic Online Anomaly Analysis for Streaming Time Series
نویسندگان
چکیده
Identifying patterns in time series that exhibit anomalous behavior is of increasing importance in many domains, such as financial and Web data analysis. In real applications, time series data often arrive continuously, and usually only a single scan is allowed through the data. Batch learning and retrospective segmentation methods would not be well applicable to such scenarios. In this paper, we present an online nonparametric Bayesian method OLAD for anomaly analysis in streaming time series. Moreover, we develop a novel and efficient online learning approach for the OLAD model based on stochastic gradient descent. The proposed method can effectively learn the underlying dynamics of anomaly-contaminated heavy-tailed time series and identify potential anomalous events. Empirical analysis on real-world datasets demonstrates the effectiveness of our method.
منابع مشابه
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
متن کاملReal-Time Anomaly Detection for Streaming Analytics
Much of the worlds data is streaming, time-series data, where anomalies give significant information in critical situations. Yet detecting anomalies in streaming data is a difficult task, requiring detectors to process data in real-time, and learn while simultaneously making predictions. We present a novel anomaly detection technique based on an on-line sequence memory algorithm called Hierarch...
متن کامل(Not) Finding Rules in Time Series: A Surprising Result with Implications for Previous and Future Research
Time series data is perhaps the most frequently encountered type of data examined by the data mining community. Clustering is perhaps the most frequently used data mining algorithm, being useful in it’s own right as an exploratory technique, and also as a subroutine in more complex data mining algorithms such as rule discovery, indexing, summarization, anomaly detection, and classification. Giv...
متن کاملMulti-scale streaming anomalies detection for time series
In the class of streaming anomaly detection algorithms for univariate time series, the size of the sliding window over which various statistics are calculated is an important parameter. To address the anomalous variation in the scale of the pseudo-periodicity of time series, we define a streaming multi-scale anomaly score with a streaming PCA over a multi-scale lag-matrix. We define three metho...
متن کاملDynamic characterization and predictability analysis of wind speed and wind power time series in Spain wind farm
The renewable energy resources such as wind power have recently attracted more researchers’ attention. It is mainly due to the aggressive energy consumption, high pollution and cost of fossil fuels. In this era, the future fluctuations of these time series should be predicted to increase the reliability of the power network. In this paper, the dynamic characteristics and short-term predictabili...
متن کامل